Parallel Algorithms for Linear Approximation on Distributed Memory Machines

Authors

  • Yongwha Chung
  • Viktor K. Prasanna
Abstract

(Summary of Results) In this paper, we summarize our results in parallelizing the linear approximation step on current distributed memory machines. We first analyze the features of current distributed memory machines and the problem characteristics to understand the overheads in parallel solutions to the problem. Based on these, we propose an asynchronous algorithm which enhances processor utilization and overlaps communication with computation by maintaining algorithmic threads in each processing node. Our implementation shows that, given a 512 × 512 image, the linear approximation task can be performed in 0.015 seconds on an SP-2 having 64 processing nodes and in 0.032 seconds on a T3D having 32 processing nodes. A serial implementation takes 0.445 seconds on a single processing node of SP-2 and 0.779 seconds on a single processing node of T3D. Experimental results on various sizes of images using 4, 8, 16, 32, and 64 processing nodes are also reported.
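The timings quoted in the abstract imply a speedup and a parallel efficiency for each machine, although the abstract itself states neither. The short sketch below derives them using the standard definitions (speedup = serial time / parallel time, efficiency = speedup / number of nodes). The function names and the `REPORTED` table are illustrative and not taken from the paper; only the four timings and the node counts come from the abstract.

```python
# Derive speedup and efficiency from the timings quoted in the abstract.
# The helper names and the REPORTED table are illustrative assumptions;
# only the timings and node counts are taken from the abstract.

REPORTED = {
    "SP-2": {"serial_s": 0.445, "parallel_s": 0.015, "nodes": 64},
    "T3D": {"serial_s": 0.779, "parallel_s": 0.032, "nodes": 32},
}


def speedup(serial_s, parallel_s):
    """Ratio of serial run time to parallel run time."""
    return serial_s / parallel_s


def efficiency(serial_s, parallel_s, nodes):
    """Speedup divided by the number of processing nodes used."""
    return speedup(serial_s, parallel_s) / nodes


def summarize(reported=REPORTED):
    """Return {machine: (speedup, efficiency)} with light rounding."""
    rows = {}
    for machine, r in reported.items():
        s = speedup(r["serial_s"], r["parallel_s"])
        e = efficiency(r["serial_s"], r["parallel_s"], r["nodes"])
        rows[machine] = (round(s, 1), round(e, 2))
    return rows


if __name__ == "__main__":
    for machine, (s, e) in summarize().items():
        print(f"{machine}: speedup {s}x, efficiency {e:.0%}")
```

Under these definitions the SP-2 figures correspond to a speedup of roughly 29.7 on 64 nodes (about 46% efficiency) and the T3D figures to roughly 24.3 on 32 nodes (about 76% efficiency). These are derived values for a 512 × 512 image only and should not be read as results reported by the authors.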

Similar resources

Parallel algorithms in linear algebra

This paper provides an introduction to algorithms for fundamental linear algebra problems on various parallel computer architectures, with the emphasis on distributed-memory MIMD machines. To illustrate the basic concepts and key issues, we consider the problem of parallel solution of a nonsingular linear system by Gaussian elimination with partial pivoting. This problem has come to be regarded...

Fast Priority Queues for Parallel Branch-and-Bound

Currently used parallel best-first branch-and-bound algorithms either suffer from contention at a centralized priority queue or can only approximate the best-first strategy. Bottleneck-free algorithms for parallel priority queues are known but they cannot be implemented very efficiently on contemporary machines. We present quite simple randomized algorithms for parallel priority queues on distributed ...

Efficient Parallelization of Relaxation Iterative Methods for Banded Linear Systems

In this paper we present an efficient parallel implementation of relaxation iterative methods, such as the Gauss-Seidel (GS) and Successive-Over-Relaxation (SOR), for solving banded linear systems on distributed memory machines. We introduce a novel partitioning and scheduling scheme in our implementation which allows perfect overlapping of computation with communication, hence minimizing latenc...

Efficient Parallel and External Matching

We study a simple parallel algorithm for computing matchings in a graph. A variant for unweighted graphs finds a maximal matching using linear expected work and O(log² n) expected running time in the CREW PRAM model. Similar results also apply to External Memory, MapReduce and distributed memory models. In the maximum weight case the algorithm guarantees a 1/2-approximation. Although the parallel ...

Mapping Robust Parallel Multigrid Algorithms

SUMMARY The convergence rate of standard multigrid algorithms degenerates on problems with stretched grids or anisotropic operators. The usual cure for this is the use of line or plane relaxation. However, multigrid algorithms based on line and plane relaxation have limited and awkward parallelism and are quite difficult to map effectively to highly parallel architectures. Newer multigrid algorith...

Publication date: 2007